Review for NeurIPS paper: Big Self-Supervised Models are Strong Semi-Supervised Learners

Neural Information Processing Systems

Weaknesses: - Most major components of this work, such as distillation and fine-tuning, were proposed in previous works. Although the authors improve SimCLR and propose SimCLRv2, the novelty of the individual parts is somewhat limited. Nevertheless, I think the simple semi-supervised framework is still valuable for industry and future work. It explicitly points out a previously ignored paradigm in semi-supervised visual learning, where regularization-based methods dominate. I think it will inspire several future works following this paradigm.


Review for NeurIPS paper: Big Self-Supervised Models are Strong Semi-Supervised Learners

Neural Information Processing Systems

All reviewers agree that this work pushes the current SOTA on the ImageNet benchmark for semi-supervised learning. Though the method is incremental, the overall framework shows very strong experimental results. Thus, all reviewers lean toward accepting the paper. The concerns raised by R1 during the discussion have been taken into consideration for the final decision.


Big Self-Supervised Models are Strong Semi-Supervised Learners

Neural Information Processing Systems

One paradigm for learning from few labeled examples while making best use of a large amount of unlabeled data is unsupervised pretraining followed by supervised fine-tuning. Although this paradigm uses unlabeled data in a task-agnostic way, in contrast to common approaches to semi-supervised learning for computer vision, we show that it is surprisingly effective for semi-supervised learning on ImageNet. A key ingredient of our approach is the use of big (deep and wide) networks during pretraining and fine-tuning. We find that the fewer the labels, the more this approach (task-agnostic use of unlabeled data) benefits from a bigger network. After fine-tuning, the big network can be further improved and distilled into a much smaller one with little loss in classification accuracy by using the unlabeled examples for a second time, but in a task-specific way.
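
The final distillation step reuses the unlabeled data in a task-specific way: the fine-tuned big network acts as a teacher whose predicted class distributions supervise a much smaller student. Below is a minimal PyTorch-style sketch of that idea; the function names, the temperature argument, and the training-step helper are illustrative assumptions and not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Soft cross-entropy between teacher and student predictions (sketch).

    The teacher's temperature-scaled class distribution on an unlabeled
    image serves as the target for the student.
    """
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Cross-entropy against the teacher's soft targets, averaged over the batch.
    return -(teacher_probs * student_log_probs).sum(dim=-1).mean()

def distill_step(student, teacher, unlabeled_batch, optimizer, temperature=1.0):
    """One hypothetical training step on a batch of unlabeled images."""
    with torch.no_grad():
        teacher_logits = teacher(unlabeled_batch)  # teacher stays frozen
    student_logits = student(unlabeled_batch)
    loss = distillation_loss(student_logits, teacher_logits, temperature)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch only the student's parameters are updated; matching the teacher's softened distributions rather than hard pseudo-labels preserves the relative confidence the big model assigns across classes.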